Symmetric Active/Active High Availability for High-Performance Computing System Services
Related resources
Symmetric Active/Active High Availability for High-Performance Computing System Services
This work aims to pave the way for high availability in high-performance computing (HPC) by focusing on efficient redundancy strategies for head and service nodes. These nodes represent single points of failure and control for an entire HPC system as they render it inaccessible and unmanageable in case of a failure until repair. The presented approach introduces two distinct replication methods...
Towards High Availability for High-Performance Computing System Services: Accomplishments and Limitations
Over the last several years, our teams at Oak Ridge National Laboratory, Louisiana Tech University, and Tennessee Technological University have focused on efficient redundancy strategies for head and service nodes of high-performance computing (HPC) systems in order to pave the way for high availability (HA) in HPC. These nodes typically run critical HPC system services, like job and resource man...
Recovery Schemes for High Availability and High Performance Cluster Computing
Clusters and distributed systems offer two important advantages, viz. fault tolerance and high performance through load sharing. When all computers are up and running, we would like the load to be evenly distributed among the computers. When one or more computers break down, the load on these computers must be redistributed to other computers in the cluster. The redistribution is determined by t...
System Management Services for High-Performance In-situ Aerospace Computing
With the ever-increasing demand for higher bandwidth and processing capacity of today’s space exploration, space science, and defense missions, the ability to efficiently apply commercial-off-the-shelf technology for on-board computing is now a critical need. In response to this need, NASA’s New Millennium Program office has commissioned the development of the Dependable Multiprocessor for use ...
Event Services for High Performance Computing
The Internet and the Grid are changing the face of high performance computing. Rather than tightly-coupled SPMD-style components running in a single cluster, on a parallel machine, or even on the Internet programmed in MPI, applications are evolving into sets of collaborating elements scattered across diverse computational elements. These collaborating components may run on different operating ...
Journal
Journal title: Journal of Computers
Year: 2006
ISSN: 1796-203X
DOI: 10.4304/jcp.1.8.43-54